Pathways in Two-State Protein Folding 
The thermodynamics of proteins indicates that folding/unfolding takes place either through stable intermediates or through a two-state process without intermediates. The rather short folding times of the two-state process indicate that folding is guided. We reconcile these two seemingly contradictory observations quantitatively in a schematic model of protein folding. We propose a new dynamical transition temperature which is lower than the thermodynamic one, in qualitative agreement with in vivo measurements of protein stability using E. coli. Finally, we demonstrate that our framework is easily generalized to encompass cold unfolding, and we make predictions that relate the sharpness of the cold and hot unfolding transitions.

Proteins fold to a uniquely defined ground state, and do this in spite of the astronomical number of possible states. This paradox, usually attributed to Levinthal, is further sharpened by the thermodynamic evidence that, for a large sub-class of single-domain proteins, the folding transition behaves nearly as that of a two-state system [1-4]. One would think that an on-off process would exclude the possibility of guiding, and indeed simple guiding predicts a first order phase transition which is far softer than experimentally found for the two-state class of proteins. The purpose of this letter is to quantify the degree of guiding that is compatible with the observed two-state folding process. We do this by generalizing a hierarchical protein model introduced earlier in Ref. [?]. In this model guiding dominates the dynamics of the folding process, which in this framework is defined through the Monte Carlo (MC) procedure applied when simulating the stochastic behavior of the model. We find here that guiding can dominate the dynamics of folding while the thermodynamic behaviour remains that of a two-state process.

Another prediction of our scenario is that the cold unfolding transition [?] should exhibit a sharpness close to that of the hot unfolding transition. To our knowledge there are no experimental studies of the sharpness of the cold unfolding transition.

The van't Hoff relation [?],

$$\Delta H = \alpha k T_c^2 \frac{\Delta C}{Q} , \qquad (1)$$

provides a powerful way to quantify the sharpness of a first order phase transition taking place at $T_c$. It relates the enthalpy difference between the two phases, $\Delta H$, to the height of the heat capacity peak, $\Delta C$, and the latent heat of the transition, $Q$, with a proportionality factor $\alpha$. A smaller $\alpha$ corresponds to a sharper transition. When the transition is two-state, $\alpha = 4$, and when the transition has a large number of equally stable intermediates, $\alpha = 12$. For the single-domain proteins ribonuclease, lysozyme, chymotrypsin, cytochrome c and myoglobin, Privalov and Khechinashvili [?] find experimentally

$$\alpha = 4.2 \qquad (2)$$

to within 5% accuracy, demonstrating that these transitions are very nearly two-state. Other small proteins like α-lactalbumin, RNase H, barnase and cyt c have known metastable intermediates [?].

The two extremes of protein folding are spanned by the two-state description and a multiple-state "zipper"-like [?] description of the process [?]. We sketch the model and its parametrization briefly here. The relevant degrees of freedom (conformational angles) are modeled through binary variables $\psi_i$. They either locally match the ordered structure, $\psi_i = 1$, or they do not, $\psi_i = 0$. Guiding is imposed through the series of inequalities

$$\psi_i \ge \psi_{i+1} . \qquad (3)$$

The variables $\psi_i$ alone cannot describe the degrees of freedom that become liberated when a portion of the protein does not match the local structure of the native state. In order to take these extra degrees of freedom into account, a second, independent set of variables, $\xi_i$, is introduced. For simplicity, these variables are assigned the values $1$ or $1 - E$. The Hamiltonian is

$$H = -\sum_{i=1}^{N} \psi_i \xi_i , \qquad (4)$$

with the constraints (3) in effect. The interpretation of the terms in this Hamiltonian is that when there is a local match, $\psi_i = 1$, there is an energy cost of $E$ to change the variable $\xi_i$. When there is no match, there is no energy cost associated with changing $\xi_i$: it "flaps" freely.

We note that for any finite value of $E$, the protein may change structure locally due to a change in $\xi_i$ even in the parts of the protein where $\psi_i = 1$. In order to simplify the analysis, we assume $E$ to be sufficiently large compared to any other energy scale in the system, in particular $kT$, where $T$ is the temperature, so that the variables $\xi_i$ never take the value $1 - E$ when $\psi_i = 1$.

We may define a set of binary, unconstrained variables $\varphi_i$, taking the values zero or one, such that

$$\psi_i = \varphi_1 \varphi_2 \cdots \varphi_i . \qquad (5)$$

In particular, $\psi_1 = \varphi_1$. In the limit $E \to \infty$, the Hamiltonian (4) becomes

$$H_{p1} = -\varphi_1 - \varphi_1\varphi_2 - \varphi_1\varphi_2\varphi_3 - \cdots - \varphi_1\varphi_2\cdots\varphi_N , \qquad (6)$$

where there are no additional constraints. The role of the variables $\xi_i$ is now played by the degeneracy present in (6).
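The change of variables in (5) and the degeneracy of (6) can be illustrated by enumerating all configurations of the unconstrained variables for a short chain. This is a minimal sketch: the function names and the choice $N = 4$ are assumptions made for illustration and do not come from the model description itself.

```python
from collections import Counter
from itertools import product


def psi_from_phi(phi):
    """Cumulative products psi_i = phi_1 * ... * phi_i, as in Eq. (5)."""
    psi, running = [], 1
    for value in phi:
        running *= value
        psi.append(running)
    return psi


def energy_p1(phi):
    """Guided Hamiltonian of Eq. (6): minus the number of leading ones."""
    return -sum(psi_from_phi(phi))


def degeneracy_by_level(N):
    """Count configurations of phi for each folding level n = -H_p1."""
    counts = Counter()
    for phi in product((0, 1), repeat=N):
        psi = psi_from_phi(phi)
        # The guiding constraint (3) holds automatically for psi.
        assert all(a >= b for a, b in zip(psi, psi[1:]))
        counts[-energy_p1(phi)] += 1
    return dict(counts)


if __name__ == "__main__":
    print(degeneracy_by_level(4))
```

For $N = 4$ the levels $n = 0, \ldots, 4$ contain 8, 4, 2, 1 and 1 configurations, i.e. $2^{N-n-1}$ for $n < N$ and a single fully ordered state.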

One can easily show that (6) has a first order phase transition at $T_c = 1/\log(2)$, where the ordered phase $\{\varphi_i\} = \{1111\cdots1\}$ with energy $U = -N$ melts to a disordered structure with energy $U \approx 0$. Thus $\Delta H = Q = N$ and $\Delta C = N^2 \log(2)^2/12$, leading to $\alpha = 12$. On the other hand, if we only consider a rescaled last term,

$$H_{p2} = -N \varphi_1\varphi_2\cdots\varphi_N , \qquad (7)$$

then one also obtains a phase transition at $T_c = 1/\log(2)$, with $\Delta H = Q = N$ but with $\Delta C = N^2 \log(2)^2/4$. Thus in this case $\alpha = 4$. There is no guiding in the Hamiltonian (7), since the ground state, $\{1111\cdots111\}$, is one out of the $2^N$ possible states, while all the other $2^N - 1$ states are degenerate.
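The two values of $\alpha$ can be checked with a short calculation. The following is a sketch in units where $k = 1$ and for large $N$; the folded-state probability $p$ is a symbol introduced here for illustration only.

```latex
% Two-state Hamiltonian (7): one state at energy -N, 2^N - 1 states at energy 0.
\begin{align*}
Z &= e^{N/T} + 2^N - 1, \qquad p = \frac{e^{N/T}}{Z}, \qquad U = -N p, \\
C &= \frac{dU}{dT} = \frac{N^2}{T^2}\, p\,(1-p).
\end{align*}
% p = 1/2 when e^{N/T_c} = 2^N - 1, i.e. T_c \approx 1/\log(2), where p(1-p) = 1/4.
\begin{align*}
\Delta C \approx \frac{N^2 \log(2)^2}{4}, \qquad
\alpha = \frac{\Delta H\, Q}{T_c^2\, \Delta C}
       = \frac{N \cdot N \cdot \log(2)^2}{N^2 \log(2)^2 / 4} = 4 .
\end{align*}
% Guided Hamiltonian (6): level n < N has energy -n and degeneracy 2^{N-n-1}.
% At T_c = 1/\log(2) every weight 2^{N-n-1} e^{n/T_c} equals 2^{N-1}, so n is
% approximately uniform on 0, ..., N and Var(n) \approx N^2/12.
\begin{align*}
\Delta C = \frac{\mathrm{Var}(n)}{T_c^2} \approx \frac{N^2 \log(2)^2}{12}, \qquad
\alpha = \frac{N \cdot N \cdot \log(2)^2}{N^2 \log(2)^2 / 12} = 12 .
\end{align*}
```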

We define time in the model based on the MC method [?]. The values of $\varphi_i$ are chosen or changed randomly, and acceptance of each choice depends upon the usual Boltzmann factor due to any energy shift connected to this. Time advances by one unit for every attempted update of the $\varphi_i$ variables. We note, however, that the dynamics of an MC procedure may be different from the actual dynamics of a given Hamiltonian, although properties at thermal equilibrium are properly represented.

The average folding time, measured as the typical number of states visited before finding the ground state, is widely different in the two models. For the true two-state model (7) the average folding time is $2^N/2$. For the guided system governed by (6) the ground state is found in a time growing as $N^2$ when $T$ is below $T_c$. To reconcile the fact that a large class of proteins behave as two-state systems with the necessity of being able to reach the ground state in a reasonable time, we now study a combination of the two Hamiltonians (6) and (7),

$$H_p = \lambda_p H_{p1} + (1 - \lambda_p) H_{p2} . \qquad (8)$$
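The contrast between guided and unguided folding times can be sketched with a single-flip Metropolis simulation of (8). The update rule, the values $N = 10$ and $T = 0.5$, the number of runs and the random seed are all assumptions chosen for illustration; the text specifies only that updates are random and accepted with the usual Boltzmann factor.

```python
import math
import random


def energy(phi, lam_p):
    """H_p of Eq. (8) for a configuration phi of zeros and ones."""
    N = len(phi)
    n = 0  # number of leading ones, i.e. the folding level
    while n < N and phi[n] == 1:
        n += 1
    bonus = N if n == N else 0  # H_p2 only rewards the full ground state
    return -lam_p * n - (1.0 - lam_p) * bonus


def folding_time(N, lam_p, T, rng, max_steps=10**6):
    """Attempted Metropolis updates until all phi_i = 1 (capped at max_steps)."""
    phi = [rng.randint(0, 1) for _ in range(N)]
    E = energy(phi, lam_p)
    for t in range(max_steps):
        if all(phi):
            return t
        i = rng.randrange(N)
        phi[i] ^= 1  # propose a single flip
        dE = energy(phi, lam_p) - E
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            E += dE  # accept the move
        else:
            phi[i] ^= 1  # reject: undo the flip
    return max_steps


def mean_folding_time(N, lam_p, T, runs=200, seed=0):
    """Average folding time over independent random starting configurations."""
    rng = random.Random(seed)
    return sum(folding_time(N, lam_p, T, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    N, T = 10, 0.5  # illustrative values, with T well below T_c = 1/log(2)
    print("guided    (lambda_p = 1):", mean_folding_time(N, 1.0, T))
    print("two-state (lambda_p = 0):", mean_folding_time(N, 0.0, T))
```

In this sketch the guided chain reaches the ground state far sooner than the unguided one, which has to find it by an undirected search among $2^N$ states.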

This Hamiltonian has a transition at $T_c = 1/\log(2)$ for all values of $\lambda_p$. The behaviour can be parametrized by the smallest $n$ for which $\varphi_{n+1} = 0$. For a given temperature, the partial free energy of states characterized by $n$ is $F(n) = n\,(T\log(2) - \lambda_p) - \delta_{n,N}\,(1 - \lambda_p)\,N$. In Fig. 1 we show $F(n)$ schematically for different temperatures $T$. Each $F(n)$ exhibits a jump at $n = N$ corresponding to the energy gain $N(1 - \lambda_p)$ for reaching the ground state. At low $T$, $F(n)$ is monotonically decreasing, reflecting fast folding kinetics where the typical folding time grows as $N^2$. At an intermediate $T = T_q = \lambda_p/\log(2)$ all $n < N$ are equally probable. For $T$ in the interval between $T_q$ and $T_c$ the intermediate states are unstable (see Fig. 1), i.e. they form a barrier between the folded and denatured states, and the folding time scales exponentially with both $T$ and $N$. At a higher $T = T_c = 1/\log(2)$ the folded state becomes unstable, and the protein melts ($\langle n \rangle \approx 0$). The fact that the free energy landscape changes with $T$ means that two-state folding around $T_c$ is compatible with guiding and fast folding at low $T$.
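The three temperature regimes can be read off directly from $F(n)$. The sketch below evaluates the expression given above with $k = 1$; the values $N = 20$ and $\lambda_p = 0.7$ and the function names are illustrative assumptions.

```python
import math


def partial_free_energy(n, N, lam_p, T):
    """F(n) = n (T log 2 - lam_p) - delta_{n,N} (1 - lam_p) N, with k = 1."""
    F = n * (T * math.log(2) - lam_p)
    if n == N:
        F -= (1.0 - lam_p) * N  # extra binding energy of the ground state
    return F


def landscape(N, lam_p, T):
    """F(n) for all folding levels n = 0..N."""
    return [partial_free_energy(n, N, lam_p, T) for n in range(N + 1)]


if __name__ == "__main__":
    N, lam_p = 20, 0.7               # illustrative values
    T_q = lam_p / math.log(2)        # dynamical transition temperature
    T_c = 1.0 / math.log(2)          # thermodynamic transition temperature
    regimes = [("T < T_q", 0.8 * T_q),
               ("T_q < T < T_c", 0.5 * (T_q + T_c)),
               ("T > T_c", 1.1 * T_c)]
    for label, T in regimes:
        F = landscape(N, lam_p, T)
        print(f"{label}: F(N-1) = {F[N - 1]:.2f}, F(N) = {F[N]:.2f}")
```

Below $T_q$ the landscape slopes down towards $n = N$; between $T_q$ and $T_c$ it rises up to $n = N - 1$ and then drops below $F(0)$ at $n = N$, which is the barrier described above; above $T_c$ the folded state lies higher than $n = 0$.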

Fig. 2 shows the van't Hoff coefficient $\alpha$ as a function of $\lambda_p$ on the unit interval, based on direct calculation of the partition function. One observes that increasing $\lambda_p$, i.e. increasing the guiding, leads to increasing $\alpha$ and thus a softening of the transition. As $N$ is increased, the regime where $\alpha$ is very close to 4 expands towards higher values of $\lambda_p$. For example, with the experimental observation of $\alpha = 4.2$, and assuming $N = 10$, $\lambda_p$ is close to zero, while for $N = 100$, $\lambda_p$ is approximately 0.7. Thus, in this latter case, 70% of the energy difference between the unfolded and folded states sits in the guiding, and still $\alpha$ is very close to the value indicating that the folding process is essentially a two-state process.
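A direct calculation of this kind can be sketched by summing over the folding levels $n$ of (8). The sketch assumes $k = 1$, takes $\Delta H = Q = N$ as in the limiting cases above, and identifies $T_c$ and $\Delta C$ with the position and height of the numerically located heat capacity peak; these conventions and the temperature grid are assumptions, and the conventions behind Fig. 2 may differ.

```python
import math


def levels(N, lam_p):
    """(energy, log-degeneracy) for the folding levels n = 0..N of Eq. (8)."""
    out = [(-lam_p * n, (N - n - 1) * math.log(2)) for n in range(N)]
    out.append((-float(N), 0.0))  # unique ground state: lam_p*N + (1-lam_p)*N
    return out


def heat_capacity(lv, T):
    """C = Var(E) / T^2 with k = 1, using shifted weights for stability."""
    logw = [s - e / T for e, s in lv]
    m = max(logw)
    w = [math.exp(x - m) for x in logw]
    Z = sum(w)
    E1 = sum(wi * e for wi, (e, _) in zip(w, lv)) / Z
    E2 = sum(wi * e * e for wi, (e, _) in zip(w, lv)) / Z
    return (E2 - E1 * E1) / (T * T)


def vant_hoff_alpha(N, lam_p, T_min=0.5, T_max=3.0, steps=2500):
    """alpha = dH * Q / (T_c^2 * dC), assuming dH = Q = N and T_c at the C peak."""
    lv = levels(N, lam_p)
    grid = [T_min + (T_max - T_min) * i / steps for i in range(steps + 1)]
    dC, T_c = max((heat_capacity(lv, T), T) for T in grid)
    return N * N / (T_c * T_c * dC)


if __name__ == "__main__":
    for N in (10, 100):
        for lam_p in (0.0, 0.5, 0.7, 1.0):
            print(N, lam_p, round(vant_hoff_alpha(N, lam_p), 2))
```

Under these assumptions, for $N = 100$ the coefficient stays close to 4 at $\lambda_p = 0$, is only slightly larger at $\lambda_p = 0.7$, and approaches 12 at $\lambda_p = 1$.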

We now discuss the fact that large $N$ allows for more guiding without destroying the two-state nature of the transition. To understand this, we note that any $\lambda_p < 1$ in fact defines a virtual phase transition at $T = T_q < T_c$. At $T_q$ the protein would melt were it not for the additional gain in binding energy when the ground state is reached. This virtual transition is not seen directly in equilibrium thermodynamics, but strongly influences the dynamic behaviour in the temperature range between $T_q$ and $T_c$: in this intermediate regime the protein is a two-state system where an occasional melting implies a long waiting time with many partial refreezing attempts. Due to fluctuations, a system with small $N$ can be partly refrozen also above the virtual transition, and thus $\alpha$ depends on system size as shown in Fig. 2.
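The virtual transition can be made visible by computing the mean folding level with and without the extra binding energy of the ground state. This is a minimal sketch with $k = 1$; the switch `with_bonus`, the function name and the values $N = 100$, $\lambda_p = 0.7$ are assumptions introduced for illustration.

```python
import math


def mean_level(N, lam_p, T, with_bonus=True):
    """<n>/N for H_p of Eq. (8); with_bonus=False drops the (1 - lam_p) H_p2 term."""
    logw = [(N - n - 1) * math.log(2) + lam_p * n / T for n in range(N)]
    top = lam_p * N / T
    if with_bonus:
        top += (1.0 - lam_p) * N / T  # extra binding energy of the ground state
    logw.append(top)
    m = max(logw)
    w = [math.exp(x - m) for x in logw]
    return sum(n * wn for n, wn in enumerate(w)) / (N * sum(w))


if __name__ == "__main__":
    N, lam_p = 100, 0.7  # illustrative values: T_q is about 1.01, T_c about 1.44
    for T in (0.9, 1.2, 1.6):
        print(T,
              round(mean_level(N, lam_p, T, with_bonus=False), 3),
              round(mean_level(N, lam_p, T, with_bonus=True), 3))
```

Without the extra binding energy the chain is already mostly unfolded at $T = 1.2$, just above $T_q$, whereas with it the chain stays folded until $T_c$ is passed.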

Experimentally, if one depends on dynamics, one presumably measures $T_q$ as the transition temperature, while for experiments based on thermodynamics it would be $T_c$. For fast-living organisms such as E. coli, the overall fraction of unfolded proteins can be monitored through the level of the chaperone DnaK [?], [?]. For temperatures between 13 and 37 °C the DnaK level per E. coli cell rises slowly from 4000 to 6000, whereafter it rises sharply to ~8500 at 42 °C and ~18000 at 46 °C [?], [?]. At 50 °C E. coli dies. This may be taken as an indication that in the temperature interval above 37 °C the typical proteins need help in the folding process. But as the cell is able to sustain life up to about 50 °C, the typical proteins must have some stability up to this higher temperature. This is reminiscent of the behaviour of our model, with a $T_q$ of about 37 °C, an exponentially slow folding of proteins at higher temperatures, necessitating the help of chaperones, and a $T_c$ of the order of 50 °C [14].

The above considerations can be extended to include a more realistic scenario where the protein is reacting with water. Following Ref. [?], we parametrize this through water variables $w_1, w_2, \ldots, w_N$, taking values $\epsilon_{\min} + s\Delta$, $s = 0, 1, \ldots, g - 1$. Here, $\Delta$ is the spacing of the energy levels of the water-protein interactions. We quantify the coupling to the water by a combination of the Hamiltonians

$$H_{w1} = (1 - \varphi_1)\, w_1 + (1 - \varphi_1\varphi_2)\, w_2 + \cdots + (1 - \varphi_1\varphi_2\cdots\varphi_N)\, w_N , \qquad (9)$$

and

$$H_{w2} = (1 - \varphi_1\varphi_2\cdots\varphi_N)(w_1 + \cdots + w_N) , \qquad (10)$$

to form the total Hamiltonian

$$H = \lambda_p H_{p1} + (1 - \lambda_p) H_{p2} + \lambda_w H_{w1} + (1 - \lambda_w) H_{w2} . \qquad (11)$$

(Here it may be noted that $H_{w2}$ may introduce non-local interactions between distant units when the terms are interpreted using the variables $\psi_i$ and $\xi_i$.) When $\lambda_p = \lambda_w = 1$ we are back to the Hamiltonian defined in Ref. [?], whereas when $\lambda_p = \lambda_w = 0$ we are facing a two-state Hamiltonian. In Fig. 3 we display the heat capacity curves for these two extremes. The system is folded in its ground state between the cold unfolding transition at $T = 1.2$ and the hot unfolding transition at $T = 4.7$. As also quantified by the van't Hoff coefficients, we see that the Hamiltonian without guiding gives a phase transition which is about a factor of 3 sharper for both the cold and the hot unfolding transitions. Also, in terms of temperature, these transitions are much more separated than in real systems, where the freezing of water will look much more like "absolute zero". The present model as it stands is not able to account for this.
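The window of stability between cold and hot unfolding can be sketched by summing the water variables analytically for each folding level $n$ of (11). In this sketch, with $k = 1$, a water variable attached to unit $i \le n$ is coupled with strength $1 - \lambda_w$, one attached to unit $i > n$ with strength 1, and in the ground state all water variables are decoupled; the function names are assumptions, and the default parameters are those quoted in the caption of Fig. 3.

```python
import math


def log_water_sum(c_beta, eps_min, delta, g):
    """log of sum over s = 0..g-1 of exp(-c_beta * (eps_min + s * delta))."""
    return math.log(sum(math.exp(-c_beta * (eps_min + s * delta))
                        for s in range(g)))


def folded_probability(N, lam_p, lam_w, T, eps_min=-3.1, delta=0.04, g=350):
    """Probability of the ground state n = N for the Hamiltonian (11), k = 1."""
    beta = 1.0 / T
    lz_full = log_water_sum(beta, eps_min, delta, g)                  # units i > n
    lz_part = log_water_sum((1.0 - lam_w) * beta, eps_min, delta, g)  # units i <= n
    logw = [(N - n - 1) * math.log(2) + beta * lam_p * n
            + n * lz_part + (N - n) * lz_full for n in range(N)]
    logw.append(beta * N + N * math.log(g))  # folded: each water has g free states
    m = max(logw)
    w = [math.exp(x - m) for x in logw]
    return w[-1] / sum(w)


if __name__ == "__main__":
    for T in (0.8, 1.1, 1.3, 3.0, 4.5, 4.9, 6.0):
        print(T, round(folded_probability(50, 0.0, 0.0, T), 3))
```

For the two-state limit $\lambda_p = \lambda_w = 0$ with $N = 50$, the ground state dominates at intermediate temperatures and loses its weight both at low and at high temperature, with the crossovers close to the quoted $T = 1.2$ and $T = 4.7$.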

In Fig. 4 we investigate systematically the van't Hoff coefficient $\alpha$ as a function of $\lambda_p$ and $\lambda_w$ for the hot (Fig. 4a) and the cold (Fig. 4b) transition. As is evident, $\alpha$ is similar but somewhat larger for the hot than for the cold transition. As a consequence, the cold transition is slightly sharper. We are not aware of any experimental measurements of the van't Hoff coefficient for the cold transition. Such a measurement will, however, in practice be hampered, as the cold transition is mainly seen experimentally at pH values where it is close to the hot transition.






Finally, we note the distinct feature of the cold-transition $\alpha$ when $(\lambda_p, \lambda_w) \approx (1, 0)$, where it drops to a value below 4. We will show in [15] that this is a remnant of the phase transition that would have appeared at $T = T_c = 1/\log(2)$ if the system were not coupled to the water.

We summarize by noting that in this protein model it is easy to reconcile the thermodynamics of a two-state system with the dynamics of a guided system, as this can be done by diminishing $\lambda_p$ and/or $\lambda_w$ from the value one. The dynamical consequence of the thereby masked guiding is a folding time that is dramatically reduced when the temperature moves away from the transition temperature.

We note as a final consequence of our model that good folders can be viewed as random sequences of folding steps of which the last have a particularly favorable binding energy, thereby securing two-state cooperativity.

Nazareno for warm hospitality and the I.C.C.M.P. for support during our stay in Brazil. We thank G. Zocchi for countless discussions.

FIG. 1. A schematic drawing of the partial free energy $F(n)$ as a function of the level of folding $n$ for different temperatures $T$.

FIG. 2. The van't Hoff coefficient $\alpha$ as a function of $\lambda_p$ for $N = 10$ and $100$.

FIG. 3. Heat capacity curves for an $N = 50$ system with and without guiding, i.e. with $\lambda_p = \lambda_w = 1$ and $\lambda_p = \lambda_w = 0$, respectively. The parameters for the water variables are $\epsilon_{\min} = -3.1$, $\Delta = 0.04$ and $g = 350$.

FIG. 4. The van't Hoff coefficient $\alpha$ for the a) hot and b) cold transition for an $N = 100$ system. Other parameters are as in Fig. 2.